Applying morphological decomposition to statistical machine translation
نویسندگان
چکیده
This paper describes the Aalto submission for the German-to-English and the Czechto-English translation tasks of the ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR. Statistical machine translation has focused on using words, and longer phrases constructed from words, as tokens in the system. In contrast, we apply different morphological decompositions of words using the unsupervised Morfessor algorithms. While translation models trained using the morphological decompositions did not improve the BLEU scores, we show that the Minimum Bayes Risk combination with a word-based translation model produces significant improvements for the Germanto-English translation. However, we did not see improvements for the Czech-toEnglish translations.
منابع مشابه
Segmentation for English-to-Arabic Statistical Machine Translation
In this paper, we report on a set of initial results for English-to-Arabic Statistical Machine Translation (SMT). We show that morphological decomposition of the Arabic source is beneficial, especially for smaller-size corpora, and investigate different recombination techniques. We also report on the use of Factored Translation Models for Englishto-Arabic translation.
متن کاملMinimum Bayes Risk Combination of Translation Hypotheses from Alternative Morphological Decompositions
We describe a simple strategy to achieve translation performance improvements by combining output from identical statistical machine translation systems trained on alternative morphological decompositions of the source language. Combination is done by means of Minimum Bayes Risk decoding over a shared Nbest list. When translating into English from two highly inflected languages such as Arabic a...
متن کاملApplying Morphological Decompositions to Statistical Machine Translation
This paper describes the Aalto submission for the German-to-English and the Czechto-English translation tasks of the ACL 2010 Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR. Statistical machine translation has focused on using words, and longer phrases constructed from words, as tokens in the system. In contrast, we apply different morphological decompositions of words ...
متن کاملMyanmar Phrases Translation Model with Morphological Analysis for Statistical Myanmar to English Translation System
This paper presents Myanmar phrases translation model with morphological analysis. The system is based on statistical approach. In statistical machine translation, large amount of information is needed to guide the translation process. When small amount of training data is available, morphological analysis is needed especially for morphology rich language. Myanmar language is inflected language...
متن کاملApplying Morphology Generation Models to Machine Translation
We improve the quality of statistical machine translation (SMT) by applying models that predict word forms from their stems using extensive morphological and syntactic information from both the source and target languages. Our inflection generation models are trained independently of the SMT system. We investigate different ways of combining the inflection prediction component with the SMT syst...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2010